Using decision trees within the tilt intonation model to predict F0 contours
Abstract
This paper presents an intonation generation system for use in a text-to-speech synthesis system. The intonation generation system uses classification trees to predict intonation event location and regression trees to predict parameters relating to the F0 shape for the predicted events. The decision trees model intonation within the Tilt intonation model, which provides a parameterized description of fundamental frequency and an intuitive labelling scheme. The event location trees predict an event class (e.g. accent, boundary, none) for each syllable in an utterance based on local and global context (e.g. stress, phrasing, part of speech). The parameter prediction trees then provide the parameterized description of each intonation event based on similar context features. Informal results of the full system are presented together with results for the individual components.
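The two-stage structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' system: the trees below are hand-built rather than trained, and every feature name (`stressed`, `content_word`, `phrase_final`, `phrase_position`), threshold, and leaf value is a hypothetical placeholder. For brevity each regression leaf stores all three event parameters at once, which is an assumption of this sketch rather than something the abstract states.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    # Event class label (classification) or parameter dict (regression).
    value: object


@dataclass
class Node:
    feature: str       # name of the context feature tested at this node
    threshold: float   # go to `low` if feature <= threshold, else `high`
    low: "Tree"
    high: "Tree"


Tree = Union[Leaf, Node]


def predict(tree, features):
    """Walk a decision tree from the root to a leaf and return the leaf value."""
    while isinstance(tree, Node):
        tree = tree.low if features[tree.feature] <= tree.threshold else tree.high
    return tree.value


# Stage 1 (hypothetical): classification tree giving an event class per syllable.
EVENT_TREE = Node(
    "phrase_final", 0.5,
    low=Node(
        "stressed", 0.5,
        low=Leaf("none"),
        high=Node("content_word", 0.5, low=Leaf("none"), high=Leaf("accent")),
    ),
    high=Leaf("boundary"),
)

# Stage 2 (hypothetical): one regression tree per event class; the leaf values
# are made-up numbers standing in for predicted event parameters.
PARAM_TREES = {
    "accent": Node(
        "phrase_position", 0.5,
        low=Leaf({"amplitude_hz": 30.0, "duration_s": 0.25, "tilt": 0.4}),
        high=Leaf({"amplitude_hz": 20.0, "duration_s": 0.22, "tilt": 0.1}),
    ),
    "boundary": Leaf({"amplitude_hz": 15.0, "duration_s": 0.20, "tilt": -0.8}),
}


def predict_events(syllables):
    """Return (event_class, parameters or None) for each syllable."""
    results = []
    for features in syllables:
        event = predict(EVENT_TREE, features)
        params = None if event == "none" else predict(PARAM_TREES[event], features)
        results.append((event, params))
    return results


# Toy utterance of three syllables with hypothetical context features.
syllables = [
    {"stressed": 0, "content_word": 0, "phrase_final": 0, "phrase_position": 0.0},
    {"stressed": 1, "content_word": 1, "phrase_final": 0, "phrase_position": 0.3},
    {"stressed": 1, "content_word": 1, "phrase_final": 1, "phrase_position": 1.0},
]
result = predict_events(syllables)
```

Here `result` holds one entry per syllable: syllables classed as `none` carry no parameters, while accents and boundaries carry the parameter set that a later stage would turn into an F0 contour.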
Similar resources
An Overview of Prosodic Modelling for Croatian Speech Synthesis
In order to include prosody into the text to speech (TTS) systems prosody knowledge needs to be acquired, represented and incorporated. Two main features of prosody important for modelling prosody for TTS systems are duration and F0 contour. There are various approaches to modelling those features and they can be categorized into three main groups: rule based, statistical and minimalistic. Some...
Analysis and synthesis of intonation using the Tilt model.
This paper introduces the Tilt intonational model and describes how this model can be used to automatically analyze and synthesize intonation. In the model, intonation is represented as a linear sequence of events, which can be pitch accents or boundary tones. Each event is characterized by continuous parameters representing amplitude, duration, and tilt (a measure of the shape of the event). T...
Parameterization and automatic labeling of Hungarian intonation
In Hungarian intonation research the goal of a common framework developed by Varga (2002; [1]) is to categorize the intonation within the domain of accent groups by character contours. We propose a linear parameterization of a subset of these contours derived from polynomial stylization. These parameters were used to train classification trees and support vector machines for contour prediction....
The tilt intonation model
The tilt intonation model facilitates automatic analysis and synthesis of intonation. The analysis algorithm detects intonational events in F0 contours and parameterises them in terms of the continuously varying Tilt parameters. We describe the analysis system and give results for speaker independent spontaneous dialogue speech. We then describe a synthesis algorithm which can generate F0 conto...
Modeling DCT parameterized F0 trajectory at intonation phrase level with DNN or decision tree
In the conventional HMM-based TTS, the micro structure of F0 contour is modeled at the state level via a (clustered) decision tree. However, the decision tree based state-level modeling is difficult to capture the long term structure of speech prosody, say at intonation phrase level, due to its greedy search nature and usually sparse training data for covering a large, combinatorial number of u...
Publication date: 1999